Import Python Libraries:

Cleaning/Preprocessing Data:

CCRB Dataset Cleaning:
NYPD Arrest Dataset Cleaning:
Creating NYC Borough dataframe using BeautifulSoup and Regex

Creating a Dataframe containing the number of complaints per precinct in NYC:
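
A minimal sketch of how the per-precinct counts could be built with pandas. The column names ("complaint_id", "precinct") and the sample rows are assumptions for illustration; the real CCRB columns may be named differently.

```python
import pandas as pd

# Hypothetical sample rows standing in for the cleaned CCRB data.
ccrb = pd.DataFrame({
    "complaint_id": [101, 102, 103, 104, 105],
    "precinct": [75, 75, 44, 120, 44],
})

# Count distinct complaints per precinct, then sort so the busiest
# precincts come first (ties broken by precinct number).
complaints_per_precinct = (
    ccrb.groupby("precinct")["complaint_id"]
    .nunique()
    .reset_index(name="num_complaints")
    .sort_values(["num_complaints", "precinct"], ascending=[False, True])
    .reset_index(drop=True)
)

print(complaints_per_precinct)
```

Using `nunique` rather than `size` avoids double counting when one complaint spans several rows, for example one row per allegation.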

Creating a function to add Borough names to our CCRB Dataset:
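
One way such a function might look is sketched below. It assumes the usual NYPD numbering blocks (Manhattan 1-34, Bronx 40-52, Brooklyn 60-94, Queens 100-115, Staten Island 120-123); the function name and the ranges table are illustrative, not taken from the notebook.

```python
# Assumed precinct number ranges per borough (inclusive on both ends).
BOROUGH_RANGES = [
    (1, 34, "Manhattan"),
    (40, 52, "Bronx"),
    (60, 94, "Brooklyn"),
    (100, 115, "Queens"),
    (120, 123, "Staten Island"),
]


def precinct_to_borough(precinct):
    """Return the borough name for a precinct number, or None if unmapped."""
    for low, high, borough in BOROUGH_RANGES:
        if low <= precinct <= high:
            return borough
    return None


for number in (19, 44, 75, 103, 121, 999):
    print(number, precinct_to_borough(number))
```

With a dataframe, the function could be applied to a precinct column, for example `ccrb["borough"] = ccrb["precinct"].apply(precinct_to_borough)`, assuming the column holds integers.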

Finding rows for individual races, and combining them into a dataframe labeled "borough_df":

Exploring New York City Arrests/Crime Rate:

Read in NYPD Arrest 2019 CSV File:

Top Perpetrator Gender Category

Top Perpetrator Race Category

Top 10 Crimes Reported

Creating a new dataframe from our Arrest Dataset to have the same races as our Census Data:

Bar Graph of Most Common Race Arrested Based on Borough

Census Data Exploration

Using the requests library and BeautifulSoup to scrape U.S. Census Bureau data for New York City:

Arrests by Race and Borough

Creating a list of our boroughs, and empty lists for each item we are scraping from census:

Using a for loop, as well as Regex to scrape information from the NYC Census webpage:
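
The loop-and-regex pattern described here can be sketched as follows. The HTML fragments, the patterns, and the values are invented placeholders; the real Census page markup is different, so the patterns would need to be adapted.

```python
import re

# Hypothetical HTML per borough (placeholder markup and numbers).
pages = {
    "Bronx": (
        '<td class="label">Population</td><td class="value">1,234,567</td>'
        '<td class="label">White alone, percent</td><td class="value">12.3%</td>'
    ),
    "Queens": (
        '<td class="label">Population</td><td class="value">2,345,678</td>'
        '<td class="label">White alone, percent</td><td class="value">45.6%</td>'
    ),
}

POP_RE = re.compile(r'Population</td><td class="value">([\d,]+)</td>')
PCT_RE = re.compile(r'White alone, percent</td><td class="value">([\d.]+)%</td>')

# One list per scraped item, filled in borough order.
boroughs, populations, white_pct = [], [], []

for borough, html in pages.items():
    pop_match = POP_RE.search(html)
    pct_match = PCT_RE.search(html)
    boroughs.append(borough)
    # Store None instead of raising if a pattern does not match.
    populations.append(
        int(pop_match.group(1).replace(",", "")) if pop_match else None
    )
    white_pct.append(float(pct_match.group(1)) if pct_match else None)

print(boroughs, populations, white_pct)
```

Appending `None` on a failed match keeps the lists the same length, which matters when they are later combined into one dataframe.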

Creating a Dataframe to contain all the scraped census information:

Analyzing the number of complaints per officer:
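
As a rough illustration, complaints per officer can be tallied with `collections.Counter`. The officer IDs and complaint IDs below are made up, and the actual notebook may compute this with pandas instead.

```python
from collections import Counter

# Hypothetical (officer_id, complaint_id) pairs.
records = [
    ("A1", 1), ("A1", 2), ("B2", 3),
    ("C3", 4), ("A1", 5), ("B2", 6),
]

# Number of complaints filed against each officer.
complaints_per_officer = Counter(officer for officer, _ in records)

# How many officers have exactly n complaints (input for a histogram).
officers_by_count = Counter(complaints_per_officer.values())

print(complaints_per_officer.most_common())
print(sorted(officers_by_count.items()))
```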

Creating Two Pie Charts to Display the Number of Officers vs. Complaints:

Histogram Showing the Breakdown of Officers with Complaints:

Heatmap Exploration of Arrests vs. Demographics

Creating a heatmap comparing arrest percentage by race to complaint percentage by race:

Map of NYC Boroughs Using CCRB Data:

Writing our HTML response to a file labeled "Precinct_file.txt"

Opening our Precinct file and creating a BeautifulSoup Object:

Extracting portion of HTML code for Precinct Numbers:

Since the addresses on the page don't include each precinct's city, state, or ZIP code, we need to make an additional request to the individual hyperlink for each precinct.

The code below pulls the desired links for each of our precincts.

Creates a list called "html_list" and appends the HTML returned by the request for each precinct hyperlink.

Creates a list containing the city, state, and zip code for each precinct using the regex code below.

Pulls information from our Beautiful Soup object to give us the street address for each precinct.

Code using Regex to pull the precinct numbers from our precinct_number BeautifulSoup object.
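
A minimal sketch of this kind of extraction, assuming the precinct names appear as ordinals followed by the word "Precinct". The sample text and the pattern are illustrative only.

```python
import re

# Hypothetical text as it might come out of the BeautifulSoup object.
text = (
    "1st Precinct 5th Precinct 23rd Precinct "
    "Midtown South Precinct 102nd Precinct"
)

# Digits followed by an ordinal suffix and the word "Precinct".
PRECINCT_RE = re.compile(r"\b(\d{1,3})(?:st|nd|rd|th)\s+Precinct\b")

precinct_numbers = [int(n) for n in PRECINCT_RE.findall(text)]
print(precinct_numbers)
```

Named precincts without a number (such as "Midtown South Precinct") are not matched by this pattern and would need separate handling.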

Code using Regex that pulls the street addresses from our precinct_address BeautifulSoup object.

NOTE: One address in particular could not be extracted correctly because it contains an ampersand; I have corrected it manually, as seen below.

Creates a dictionary containing the precinct numbers and their respective street addresses and creates a dataframe out of this dictionary.
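
The dictionary-to-dataframe step might look like the sketch below. The column names and addresses are placeholders, not the scraped values.

```python
import pandas as pd

# Placeholder lists standing in for the regex results.
precinct_numbers = ["1", "5", "6"]
street_addresses = [
    "100 Example Street",
    "200 Sample Avenue",
    "300 Placeholder Place",
]

# Each key becomes a column; the lists must have equal length.
precinct_dict = {
    "Precinct": precinct_numbers,
    "Street Address": street_addresses,
}
precinct_df = pd.DataFrame(precinct_dict)

print(precinct_df)
```

If the two lists differ in length, `pd.DataFrame` raises a `ValueError`, which is a useful early signal that one of the regex patterns missed an entry.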

Using the precinct hyperlinks, we pulled the city, state, and ZIP code for each precinct and added them to this dataframe.

Next, we add our two dataframes together to get full addresses for all of our NYPD precincts.
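
A hedged sketch of combining the two dataframes side by side and building one full-address string per precinct. All column names and values are assumptions for illustration.

```python
import pandas as pd

# Placeholder street-address dataframe.
street_df = pd.DataFrame({
    "Precinct": ["1", "5"],
    "Street Address": ["100 Example Street", "200 Sample Avenue"],
})

# Placeholder city/state/ZIP dataframe, in the same row order.
city_df = pd.DataFrame({
    "City": ["New York", "New York"],
    "State": ["NY", "NY"],
    "Zip": ["10000", "10001"],
})

# axis=1 places the columns side by side, aligned on the row index.
address_df = pd.concat([street_df, city_df], axis=1)

address_df["Full Address"] = (
    address_df["Street Address"] + ", " + address_df["City"] + ", "
    + address_df["State"] + " " + address_df["Zip"]
)

print(address_df["Full Address"].tolist())
```

Because `pd.concat` with `axis=1` aligns on the index, both dataframes need the same index and row order; otherwise addresses would be paired with the wrong precinct.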

Using the Google API, we can loop through our precinct addresses and get their respective latitude and longitude coordinates.
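
The sketch below assumes the "Google API" here is the Google Maps Geocoding API and uses only the standard library. The helper names are hypothetical, the API key is a placeholder, and the sample response is a hand-written stand-in used so the parsing can be checked offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL for one address."""
    return f"{GEOCODE_URL}?{urlencode({'address': address, 'key': api_key})}"


def extract_lat_lng(payload):
    """Return (lat, lng) from a response dict, or (None, None) on failure."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return (None, None)
    location = payload["results"][0]["geometry"]["location"]
    return (location["lat"], location["lng"])


def geocode(address, api_key):
    """Call the API for one address (requires network and a valid key)."""
    with urlopen(build_geocode_url(address, api_key), timeout=10) as response:
        return extract_lat_lng(json.load(response))


# Offline check with a hand-written response of the expected shape.
sample = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 40.0, "lng": -74.0}}}],
}
print(extract_lat_lng(sample))
```

In a loop over the precinct addresses, something like `coords = [geocode(a, API_KEY) for a in addresses]` would collect the coordinate pairs, with `(None, None)` marking addresses the service could not resolve.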

Code below creates a new dataframe containing the latitude and longitude points from our Google API.

Code below concatenates our Precinct address dataframe with our new coordinate dataframe to complete our location dataframe.

Combine our Incidents dataframe with our precinct location dataframe to complete our dataset.
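
One possible way to do this combination is a left merge on the precinct number, sketched below. The column names and values are placeholders.

```python
import pandas as pd

# Placeholder incident counts per precinct.
incidents = pd.DataFrame({
    "precinct": [44, 75, 120],
    "num_complaints": [2, 2, 1],
})

# Placeholder precinct coordinates (one precinct deliberately missing).
locations = pd.DataFrame({
    "precinct": [44, 75],
    "lat": [40.1, 40.2],
    "lng": [-73.9, -73.8],
})

# Left merge keeps every incident row; validate guards against
# duplicate precinct rows on either side.
merged = incidents.merge(
    locations, on="precinct", how="left", validate="one_to_one"
)

# Precincts without coordinates cannot be placed on a map.
missing = merged[merged["lat"].isna()]

print(merged)
print(missing["precinct"].tolist())
```

Checking for missing coordinates before mapping helps catch precincts whose address lookup failed.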

Creating a Map of NYC Showing the Number of Complaints in Folium

Unused Code/Graphs:

Stacked Bar Graph of Most Common Age Group Arrested Based on Borough

Bar Graph of Gender Arrested Based on Borough

Level of Offense Based on Borough

Filtering Arrests Determined as Felony, Misdemeanor, and Violation
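
A small sketch of this filter, assuming the arrest data stores the offense level in a "LAW_CAT_CD" column with the codes F, M, and V. The sample rows are made up, and the column name should be checked against the actual CSV.

```python
import pandas as pd

# Hypothetical arrest rows; "I" stands in for any other level code.
arrests = pd.DataFrame({
    "ARREST_BORO": ["K", "M", "B", "Q", "K", "S"],
    "LAW_CAT_CD": ["F", "M", "V", "F", "M", "I"],
})

LEVELS = {"F": "Felony", "M": "Misdemeanor", "V": "Violation"}

# Keep only the three levels of interest and add a readable label.
filtered = arrests[arrests["LAW_CAT_CD"].isin(list(LEVELS))].copy()
filtered["offense_level"] = filtered["LAW_CAT_CD"].map(LEVELS)

# Share of each level, suitable as input for a pie chart.
level_share = filtered["offense_level"].value_counts(normalize=True)

print(level_share)
```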

Pie Chart of Percentage of Each Level Offense for New York